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Abstract.  The  two  sided  Lanczos  algorithm  is  known  to  suffer  from  serious 
breakdowns.  These  occur  when  the  associated  moment  matrix  does  not  permit 
triangular  factorization.  We  modify  the  algorithm  slightly  so  that  it 
corresponds  to  using  a  2><2  "pivot"  in  triangular  factorization  whenever 
a  1x1  pivot  would  be  dangerous.  The  incidence  of  breakdown  is  greatly 
reduced.  The  price  paid  is  that  the  tridiagonal  matrix  produced  by  the 
algorithm  now  has  bumps  whenever  a  2x2  pivot  is  used.  : 
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1 .  THE  LANCZOS  ALGORITHM  AND  ITS  BREAKDOWN 


The  most  popular  way  to  obtain  all  the  eigenvalues  of  a  nonsymmetric 
n  x  n  matrix  B  is  to  use  the  QR  algorithm  which  is  readily  available 
in  most  computing  centers.  As  the  order  n  increases  above  100  the  QR 
algorithm  becomes  less  and  less  attractive,  especially  if  only  a  few  of  the 
eigenvalues  are  wanted.  This  is  where  the  Lanczos  algorithm  comes  into  the 
picture.  It  does  not  alter  B  at  all  but  constructs  a  tridiagonal  matrix 
J  gradually  by  adding  a  row  and  column  at  each  step.  After  several  steps 
some  of  the  eigenvalues  9^  of  J  will  be  close  to  some  eigenvalues  A^ 
of  B  and  by  the  nth  step,  if  nothing  goes  wrong,  0.  =  A,. ,  i  =  l,...,n. 
This  description  is  correct  in  the  context  of  exact  arithmetic.  Unfortunately 
things  can  go  wrong,  even  in  the  absence  of  rounding  errors.  For  the 
relation  between  these  troubles  and  orthogonal  polynomials  see  [2]. 

In  order  to  discuss  these  troubles  we  must  give  some  details  about 
the  algorithm.  Let  be  the  k  x  k  tridiagonal  produced  at  step  k 
of  the  algorithm.  There  are  infinitely  many  tridiagonal  matrices  similar 
to  B  and  Jp  is  one  of  them.  Thus  for  some  matrix  Qp  =  (q^ , . . .  ,qn)  we 
have 

In"’ “A,  *  J„  <’> 


It  simplifies  the  exposition  considerably  to  introduce  a  redundant  symbol 
and  write  Pp*  instead  of  Qp’^  .  The  superscript  *  indicates  conjugate 
transpose. 

Let  Pp  h  (p1,...,pn)  and  replace  (1)  by  two  separate  relations 


Pn^n  -  1  • 
Pn*BQ  -  Jn  . 


(2) 

(3) 
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We  mention  in  passing  that  when  B*  =  B  =  A  then  we  can  arrange 

that  Pn  =  Qn .  The  difficulty  we  are  going  to  describe  cannot  occur 

when  P  =  Q  . 
n  n 

By  equating  elements  on  each  side  of  BQ  =  Q  J  and  P*  B  =  J  P* 

in  the  natural  increasing  order  we  shall  see  that  B,  and 

essentially  determine  all  the  other  elements  of  P  .  Q  ,  and  J  .  On 
J  nnn 

writing 

a1  Y2 
$2  a2  T3 

b2 

we  find 

P,  *Bq1  =  a]  , 

and 

Bq1  =  q]a1  +  q^  ,  P,*B  =  +  Y2P2*  * 

Hence  q2  and  p £  are,  respectively,  multiples  of  "residual"  vectors 
r1  =  Bq1  -  q1a]  ,  s-,*  =  p1*B1  -  a^*  . 

Furthermore,  since  P2*Q2  =1  by  (2),  we  have 

=  y2^32*c*2^2  =  y2^2  =  w2 ' 
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If  u>2  t  0  and  is  given  any  nonzero  value  then  y^,  q2,  and  p^* 

are  all  determined  uniquely.  A  good  choice  is  82  =  2  *  ’ 

The  general  pattern  emerges  at  the  next  step,  on  equating  the  second 

columns  on  each  side  of  BQ  =  Q  J  and  P*  B  =  J  P*  , 

n  n  n  n  n  n 

P2*Bq2  =  a2. 


Bq2  q-| Y2  ^2^2  "**  »  P2*^  ”  ^2^1*  +  ^2^2*  ^  T 3 P 3 * • 

At  this  point  we  can  compute  the  “residual"  vectors 

r 2  =  Bq2  ~  q-jY2  -  q2a2 »  s2*  ~  P2*®  —  ^2^1*  "  ^2^2* 

and 


u3  =  s2*r2  =  ^3®3  ' 

If  0)3  /  0  and  B3  is  given  any  nonzero  value  then  y3»  q3>  and  p3* 
are  all  determined  uniquely.  And  so  it  goes  on  until  some  vanishes. 

This  is  the  Lanczos  algorithm.  It  must  terminate  at  the  n^  step  with 
ujn+j  =  0  but  it  may  stop  sooner. 

Premature  termination  at  say  step  j  (<  n)  can  occur  in  two  ways: 

(I)  either  r.  =  0  or  s  .*  =  0*  or  both, 

3  3 

or 

(II)  r.  f  0,  Sj*  f  0*,  but  0)j+1  =  0. 
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In  the  1950‘s  when  the  Lanczos  algorithm  was  regarded  as  a  way  to 

compute  Jn  Case  I  was  regarded  as  a  mild  nuisance.  If  r.  =  0  then 

any  nonzero  vector  orthogonal  to  can  be  chosen  as  q^  . 

Similarly  s^  =  0  gives  ample  choice  for  Pj+i  ■ 

Today,  regarding  Lanczos  as  a  way  to  find  a  few  eigenvalues  of  large 

B  it  seems  better  to  stop  at  Case  I  in  the  knowledge  that  every  eigenvalue 

of  J.  is  an  eigenvalue  of  B.  If  more  eigenvalues  are  wanted  then  it  is 
J 

best  to  start  the  Lanczos  algorithm  afresh  with  new,  carefully  chosen 
starting  vectors  and  . 

The  real  trouble,  which  cannot  occur  when  B  =  B*  =  A,  is  Case  II. 
Wilkonson  calls  this  a  serious  breakdown.  There  seems  to  be  no  choice  but 
to  start  again  but  no  one  has  been  able  to  suggest  a  practical  way  to 
choose  the  new  q^  and  p^*  so  as  to  avoid  another  wasted  run  of  the 
algorithm.  That  is  why  the  Lanczos  method  has  not  been  used  much  when 
B*  ^  B.  In  this  article  we  propose  a  modification  of  tne  algorithm  which  greasy 

reduces  the  occurrence  of  Case  II.  The  price  paid  for  this  convenience 
is  that  J  is  not  quite  tridiagonal.  There  is  a  small  bump  (or  bulge)  in 
the  tridiagonal  form  to  mark  each  occurrence  (or  near  occurrence)  of 
Case  II. 


Example  1 .  (No  breakdown) 

B  =  diag(2,3,4),  q]*  =  1(1,1 ,1),  P^  =  j(l,2,l) 

Step  1  '•  ot^  “  3,  cu 2  "  y*  ^2  ~  ^  *  Y2 

q2*  =  1(-1,0,1),  p2*  =  (-1,0,1). 

Step  2 1  ctj  *  3  j  (1)2  *  2* »  $2  *  ^  "  2" * 

q3*  =  1(1, -1,1),  p3*  =  1(1, -2,1). 
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Step  3:  a j  =  3,  =  0. 


2.  THE  TWO  SIDED  GRAM-SCHMIDT  PROCESS. 

The  serious  breakdown  described  above  is  not  limited  to  the  Lanczos 
algorithm.  It  can  occur  in  any  attempt  to  use  the  familiar  Gram-Schmidt 
process  to  produce  a  biorthogonal  (or  biorthonormal)  pair  of  sequences. 
Our  modification  of  Lanczos  seems  more  natural  in  such  a  context. 

Let  F  =  (f-|,...,fn)  and  G  =  (g^,...,gn)  be  given  real  nonsingular 

n  by  n  matrices.  In  other  words  {f ^ . fn>  and  {g1 . gp}  are  each 

a  basis  for  the  vector  space  fP  of  column  n-vectors.  We  want  to  produce 
a  new  pair  of  bases  (q.|,...,q  }  and  {p^,...,pn}  such  that 

pn*Qn  -  =  diag(^,...,«n) 

and,  for  each  j  =  1 , . . .  ,n 

span  {q1,...,qj}  =  span  {f^ .... *f j) » 
span  {p1,...,pj}  =  span  {g1 ....  .g^ } - 

The  Gram-Schmidt  process  dictates  that  at  step  j 
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All  goes  well  until  jj.  =  0  for  some  j.  This  can  happen  despite  the 
fact  that  F  and  G  are  nonsingular. 


Example  2. 


1  1  1 


1  1  2 


1  1  1 


F  =  1  2  1  ,  G  =  0  0  1 


0  1  0 


Step  1  :  q-j  -  f-j  >  p-j  =  g^  ,  lxj  =  1  . 

Step  2:  q2*  =  (0,1  ,0),  p2*  =  (-1,0,1),  co2 


One  remedy  for  the  case  u>.  =  0  is  quite  natural.  If  p.  f  0  then 

J  J 

recompute  q^.  using  f^ ,  in  place  of  f..  If  this  fails  too  then  try 

fj+2,  and  so  on.  If  F  is  nonsingular  there  must  be  some  i  >  0  such 

that  fj+i  will  yield  a  nonzero  value  for  . 

Here  is  a  formal  proof  of  tne  last  remarKs.  Let  q.'  1  denote  the 

J 

vector  obtained  by  using  f,  instead  of  f.,  i.e., 

*  J 


<k>  .  fl. .  £ 


k  i=l 


Lemma:  If  p.  ^  0  and  p.*q.  =0  for  k  =  j,j+l,...,n  then  F  is 

J  *3  J 

singular. 

Proof:  By  construction  Pj*^^  =  0  for  i  <  j. 

Hence  p^  J_  span(f1 , . . .  ,fj_1 ) . 

Thus  0  =  p.*q/k)  =  p,*f,  -  0,  for  all  k  >  j  . 

Thus  p.  j_  span(f1,...,fn). 

Thus  p.*F  =  0  and  F  is  singular. 

J 


If  the  modification  succeeds  the  first  time  then  only  one  property  of 
Gram-Schmidt  has  been  sacrificed,  namely 
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span  (q1 » . . .  .q  )  /  span  (f-, , . . .  ,f^)  . 

Moreover,  if  no  further  breakdown  occurs  then 

span  ( q-j , . .  .  ,q.j )  =  span  (f1,...,fi)  for  i  >  j. 

In  many  applications  this  price  is  well  worth  paying. 

The  Lanczos  algorithm  can  be  regarded  as  the  two  sided  Gram-Schmidt 
process  applied  to  the  columns  of  the  special  matrices 

K  =  <n  E  (q1,Bq1,32g1,...,Bn“1g1), 

and 

K*  =  Kn*  E  (p1*,p1*B,p1*B2,...,p*Bn_1)  . 

We  will  not  establish  this  result  but  content  ourselves  with  stating  the 
key  observation,  namely 

Span  (q1 ,q2,.,.,q^,Bqj)  =  Span  (q1 , . . .  .q^  ,BJq1 )  . 

The  K  matrices  are  called  Krylov  matrices  and  the  pleasant  fact  is  that 
most  of  the  work  required  for  general  Gram-Schmidt  disappears  in  this  case 
because 

p.*Bq.  =  0  for  i  <  j-1  . 

•  3 


Thus  the  general  formula 
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Vi  ■  B\  -  it 

collapses  to 

qj+1  =  8qj  -  qjaj  -  qj.13j_1. 

3.  TRIANGULAR  FACTORIZATION  OF  MOMENT  MATRICES 

Consider  again  the  matrices  K  and  K*  defined  in  the  previous  section. 
Note  that  the  (i,j)  element  of  K*K  is  ( p1 *B1 _1 ) (BJ_1 q. ) ,  so 

K*K  =  M  =  M( p-j , q-j ) ;  mi+1>j+]  =  P1*B1+jqr 

In  order  to  use  this  observation  we  need  some  basic  facts  about  the 
lanczos  algorithm  (see  [4],  [5],  or  [7]).  If  it  does  not  breakdown  it  oro- 
duces  matrices  P  and  Q  such  that 

Vi  '  xj(B)V<  s,). 

pj+,  ■  xj(b*)p,/(  n’  y,). 


where 


Xj+1U)  -  (t  -  a j ) X j  ( t )  •  co j X j _ i  ( ^ )  > 

and  Xi  is  the  characteristic  polynomial  of  the  tridiagonal  matrix  J.. 
J  J 

In  other  words,  for  each  j  ,q  is  a  linear  combination  of  the  first  j 

J 

columns  of  K  while  Pj  is  the  same  linear  combination  of  the  columns 
of  K,  up  to  scaling.  This  can  be  expressed  compactly  in  matrix 


i  i  II I  111  'Mil  1—ai^li 
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notation  as 

Q  =  KL‘*lf 1 , 
P  =  KL_*n_1  . 


(4) 


Here 


n  =  di a g( 1 ,S2,3233,  ...  ) , 
ir  =  diag(  1  >Y2»y2y3>  •  •  •  )  . 

and  L  is  the  unit  lower  triangular  matrix  such  that  L"*  =  (L~^)*  has 

the  coefficients  of  above  the  diagonal  in  the  jt^1  column. 

J 

Using  (4)  we  can  rewrite  I  =  P*Q  as 
I  =  P*Q  =  (.-'1L~1K*)(KL"*n''1  ) 

i  .e. 

M  =  Lr.L*  ,  (5) 

where 

—  HIT  —  d i  ag ( 1  , cu2  * t^2^3  *  *  *  * 9 ^2 '  *  * ^n  ^ " 

This  result  is  not  new  (see  Householder,  [2  ])  but  it  is  worth 
emphasizing  that  the  product  ^...uk  is  the  ith  pivot  which  arises  in 
performing  Gaussian  elimination  on  the  moment  matrix  M  associated  with 
B,  q]  and  p] . 

When  B  f  B*  the  moment  matrix  M  need  not  be  positive  definite 
and  so  triangular  factorization  need  not  be  stable,  even  when  M  is 
nonsingular. 


|k  ■  ■ 


The  best  known  remedy  for  instability  is  to  use  some  form  of  row  or 
column  interchange  whenever  an  w.  is  too  small.  The  standard  "partial 
pivoting"  strategy  is  not  available  because  a  whole  column  of  M  is  not 
known  in  the  middle  of  the  Lanczos  process.  An  alternative  remedy  is  to 
enlarge  the  notion  of  a  "pivot"  to  include  2  *  2,  or  even  larger  sub¬ 
matrices.  This  idea  is  discussed  in  Parlett  and  Bunch,  1971,  [4],  It  is 
the  basis  of  our  method.  Whenever  a  2x2  pivot  is  used  the  tridiagonal 
J  bulges  outwards  temporarily. 

In  the  context  of  the  Lanczos  process  our  remedy  for  a  tiny  w. 

J 

requires  us  to  compute  at  the  same  time  as  q. ,  and  p  ^  at  the 

same  time  as  p..  The  formulas  for  these  "Lanczos"  vectors  are  somewhat 
different  from  the  standard  ones.  We  call  our  modification  the  "look-ahead 
Lanczos  algorithm"  because  it  computes  at  the  current  step  some  quantities 
wnich  are  not  usually  needed  until  the  next  step  in  the  standard  Lanczos 
process . 


4.  THE  NEXT  PIVOT 

The  decomposition  M  =  L.TL*  is  never  found  explicitly.  In  order  to 
make  use  of  the  idea  of  using  a  2x2  pivot  it  is  necessary  to  determine 
the  top  left  2x2  submatrix  of  what  would  be  the  reduced  matrix  in  the 
trangular  factorization  of  M.  These  three  numbers  can  be  determined  from 
the  information  available  in  the  Lanczos  process. 

After  i-1  steps  of  the  standard  algorithm  we  have 


ri  -  Bqi-i  '  Vlai-l  '  qi-2Yi-l  =  Xi_-](B)q1/(B2...Bi_1), 
si*  E  p*i-l8  -  ai-lp*i-l  -  6i-lp*i-2  =  pl%-l<B>/<V"*1-l 


), 


Instead  of  normalizing  r..  and  s.* 


to  get  q..  and  p^*  we  can  look 
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ahead  not  to  the  next  Lanczos  vectors  q^.p.  +i*  but  to  any  vectors 

ri+1’  si+i*  such  that 

the  plane  (ri>ri+1)  =  the  plane  (q^q^), 
the  plane  (s.*,s  *)  =  the  plane  (p.*,p1+1*) . 


The  simplest  choice  for  r^+^  and  s.^*  is 


ri+l  -  Bri  -  «1-1  *t ■ 

w 5  *t*»  -  “ipf.r  ■ 

The  coefficient  uk  ensures  that  r .  is  orthogonal  to  all  the  known 
p's,  namely  P-j  »  and  a1so  that  si+1  is  orthogonal  to  q^,...,qi 

Note  that  if  we  choose  as  q,.  any  vector  in  the  plane  (r^.r^^) 
other  than  a  multiple  of  r.  then  q..  will  be  of  the  form 

qi  =  ^(B)q1 /( 62 - • •8i ) 

with  a  monic  polynomial  of  degree  i  instead  of  the  usual  x-j  -\  • 
Moreover  it  will  be  possible  to  choose  to  be  of  the  same  form,  using 

a  different  ip  but  of  the  same  degree  i.  This  is  a  mild  generalization 
of  the  basic  Lanczos  algorithm.  The  degrees  of  the  new  Lanczos  polynomials 
are  still  nondecreasing  but  they  do  not  always  go  up  by  one  at  each  step. 
Before  making  such  a  choice  we  compute 


HrJ.ls..  il ,  and  cos/,  (r-.s..)  =  r..  !s* 
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“l+l  =  si+l*ri+l  ’  01  =  si*ri+l  =  Si'+Iri  * 

lri+l 1  »lsi+i*1 »  cos  Si+1)  =  !^i+l  1/  ri+r!';  si+l 

Our  choice  of  q,. ,  etc.  must  be  based  on  the  matrix 


9i  “i+l 


It  turns  out  that  the  top  left  2x2  submatrix  of  the  reduced  matrix 
in  the  associated  triangular  factorization  of  M  is  (uu-.-uj-  -|)W^, 
but  no  use  will  be  made  of  this  fact. 

If  both  ^  -  0  ana  W.  is  singular  then  more  drastic  measures  are  needed 
to  salvage  the  algorithm.  We  will  not  pursue  this  case  here.  When  -  0 
then  the  standard  Lanczos  algorithm  breaks  down.  Yet  if  W^  is  invertible 
it  is  easy  to  choose  suitable  q’s  and  p's  so  that  the  algorithm  can 
proceed . 

The  equations  above  can  be  condensed  into  block  form. 


(ri  ,ri  +1 )  =  B<<»i-vri>  '  (<*i-i’ri>(  V  0  )  '  ^-2^1-1  0)  > 


CXj  ,  0 


^si’si+l‘l*  =  ^  pi -1  »si )  *8  -  (  u,.  0  )  (pi-l  ,si  **  "(  0  )  pi-2*’ 


d«f  /  ei 


Wi  =  (5i’Si+l)*(ri*ri+1)  =  91 


Various  factorizations  of  yield  various  selections  for  new  q's  and 
p's.  These  are  discussed  below.  We  write  <I>  for  wi+i  and  e  for 


Let  us  drop  the  subscript  i  and  write 

U  =  VU. 


1 .  LU  factori zation 


/  i  o\  /u>  e 

v  =(  1  ,  u  -  (  2 


,0  w-8  / U)> 


2.  UL  factorization 


/  io-6  /u)  ,  3\  /I  0 

V  \  0  w/  \0/£  1 


3 .  QR  factorization 


•■(;  :r-  • 


’  9(u+w) 

0  (ux2-e2) 


-1  _  ,  2,fl2,+  1/2 

T  ,  T  =  (to  +0  ) 


4  .  LU  with  interchange 


/  oj/  0  1  \  /  0  N 

V  =(  ),  U  =( 

\  i  o  /  \o  e-ww/0/ 


5.  THE  SMALLEST  ANGLE 

In  order  to  determine  the  smallest  angle  between  the  planes  (r1-»r1-+i) 
and  (si ,s.+i )  it  is  best  to  orthonormal ize  the  bases.  Let  the  result  be 

(Fi*?i+l)  and  <s'i*9i+l)- 

when  the  Gram-Schmidt  process  is  used.  The  (matrix)  angle  between  the  two 
planes,  call  it  4>,  is  defined,  see  [1]»  by 
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cos  $  =  (si,gi+1)*(ri,fi+1). 

Let  the  Singular  Value  Decomposition  of  cos  $  be 

cos  <t  =  U  E  V* 

where  E  =  diag(0^  ,0^)  and  0^  The  appropriate  definition  of  new 

Lanczos  vectors  to  ensure  the  smallest  i(q.  ,p^)  is 

<vW  ■  <v<WV£',/2  • 

Comments .  No.  1  corresponds  to  the  standard  Lanczos  algorithm.  No.  2 
corresponds  to  simply  swapping  s^  with  s.*B  and  r.  with  Bri  in  the 
Lanczos  algorithm.  One  consequence  is  that  the  J  matrix  will  bulge  out 
of  tridiagonal  form  on  both  sides.  No.  3  is  always  stable  and  keeps  the 
bulge  on  one  side.  The  same  advantage  accrues  from  No.  4  and,  as  might 
be  expected,  is  somewhat  simpler  than  No.  3. 

We  want  to  use  No.  1  whenever  this  is  a  reasonable  strategy  but 
when  oj  is  zero  or  very  small  it  is  vital  to  choose  one  of  the  other 
procedures.  We  have  tentatively  chosen  No.  4  on  the  grounds  of  simplicity. 
The  interesting  question  which  we  now  address  is  when  to  use  No.  4. 

Initial  success  with  No.  4  has  not  encouraged  us  to  implement  No.  5. 

Criterion  for  Choosing  a  2*2  Pivot 

When  Factorization  No.  4  is  chosen  then 
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(Wl>  =  ^VW  />  -“i*l/6i\ 

I  J  ' 

(Pfpi*l>  ’  (si*l'sf)  /'  “V9i\ 

\°  '/  ' 

to  within  scalar  multiples.  Note  the  interchange.  Hence 


cos  ,  (qrp,.)  -  COS  L  (rrsi+1)  =  -ypr^-  =  ^ 

(9.-u).u).+1/0.) 

COS  L  (qi+1,P1+1)  =  - 

ri+l"/wi+l\  ri  ‘  si"/V\sl+l 

\  e<  )  \V 


Both  these  numbers  are  readily  computable,  without  forming  the  new  vectors, 
provided  that  r.+^*r^  and  s<+1*s^  are  known.  The  choice  between  No.  1 
and  No.  4  reduces  to  a  comparison  of 


4>1  =  | cos  A(ri  ,si )  j  and  <P2  =  min  { | ,  |<J>.+1 1 }  . 

If  <  100c  and  <pz  <  100c  then  our  algorithm  stops  and  reports  failure. 
Otherwise,  for  a  given  bias  factor  we  proceed  as  follows:  if  <f>^  _>  (bias 
factor)  $2  then  use  standard  Lanczos  else  use  Factorization  No.  4.  When 
bias  =  0  we  recover  the  standard  algorithm.  Currently  bias  =  1. 

Sometimes  the  test  declares  that  a  standard  Lanczos  step  is  safe.  In 
such  cases  and  i>2  are  not  used  and  their  computation  may  be  regarded 
as  an  overhead  of  4  inner  products.  No  matrix-vector  products  are  "wasted". 
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The  4  inner  products  arise  as  follows.  We  need 


We  may  regard  the  second  and  third  terms  on  the  right  hand  sides  as  the  price 
to  be  paid  for  knowing  that  a  standard  Lanczos  step  is  safe.  Observe  that 

r. +i  and  s*  are  not  computed. 

Let  us  write  R..  =  (r.  ,ri+1 ) ,  S.  =  (s^.s.^). 

Then  the  look-ahead  part  of  the  algorithm  comprises  the  computation  of  r.+^, 

s . +.|  and  the  unknown  elements  of  S^R,.  ,  R^  ,  and  S.jS.  .  Before  specifying 
the  algorithm  we  describe  the  bumpy  tridiagonal  matrix  J  . 


the  Projection  of 


The  standard  (biorthogonal )  Lanczos  algorithm  produced  a  tridiagonal 
matrix  J.  by  the  end  of  step  j  .  The  look-ahead  algorithm  produced  a  block 

J 

tridiagonal  matrix,  also  called  J.  ,  and  written  as 

J 
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The  diagonal  blocks  are  capital  a's  and  the  subdiagonal  blocks  are  capital 
B's  .  The  A.j  are  either  1  by  1  or  2  by  2  and  the  Bi  and  are 
shaped  conformably.  For  convenience  in  finding  eigenvalues  of  J  we  have 
forced  each  B.  to  have  one  of  the  following  four  forms 

[+]  »  [0+]  ’  {^oj  ’  j^O  0  5 

where  +  stands  for  a  positive  quantity.  It  turns  out  that  each  also 
has  rank  one. 

The  left  and  right  Lanczos  vectors  will  be  grouped  by  step  and  written 
as  P]*  ,  P2*  ....  .  Pj*  and  Q-j ,  ...  ,  where  sometimes  is  n  by  1 
and  sometimes  n  by  2  .  We  collect  the  vectors  together  in  Q  =  (Q,  ,  . . .  ,  Q . ) 

1  J 

A  A  A 

and  P-  =  (P,  ,  ...  ,  P.)  .  The  matrix  Q.P.*  is  the  projector  matrix  onto 

J  *  J  J  J 

the  left  and  right  Krylov  subspaces  and  B's  (oblique)  projection  into  tnem 
is  defined  by 


(Vj*> B  ">jY>  -  VjY  ■ 

Thus  Jj  is  the  representation  of  this  projection  with  respect  to  the  bases 

A  A 

given  by  the  rows  of  P-*  and  the  columns  of  Q.  . 

J  J 

Please  not  that  j  is  not  the  order  of  0.  . 

J 


6.  The  Look-ahead  Lanczos  Algorithm  (called  LAI  hereafter) 

We  have  chosen  our  notation  to  camouflage  the  complications  caused  by 
the  fact  that  each  step  may  be  either  a  single  one  or  a  double  one.  It  turns 
out  that  quantities  are  computed  in  a  somewhat  different  order  and  way  from 
the  standard  two  sided  Lanczos  algorithm  (called  LAN)  and  the  reader  may 


t 
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find  the  differences  loom  larger  than  the  similarities.  We  have  found  it 
helpful  to  think  that  step  i  takes  certain  residual  matrices  R^  and  , 
decides  how  to  modify  them,  then  turns  them  into  the  new  and  P^*  ,  and 

finally  computes  part  of  the  next  set  of  residual  matrices. 

It  is  convenient  to  define  the  index  i  by 

£  =  1  +  order  (Qi_1 )  . 

Thus  by  the  end  of  step  i  we  shall  have 

q£  ,  if  step  i  is  single, 

(qi>  ’  q£+i )  *  if  steP  is  double  . 

Similarly  for  P>  .  However,  in  all  cases,  R..  =  and  Si  = 

In  describing  LAI  we  call  any  computation  involving  n  scalar  multi¬ 
plications  or  divisions  a  vector  operation  and  abbreviate  it  as  1  v.op.  . 
The  algorithm  requires  that  the  user  supply  a  subroutine  (or  subroutines) 
for  computing  Bx  and  y*B  from  x  and  y*  .  The  cost  of  these  operations 
will  be  problem  dependent.  We  stress  that  this  is  the  only  way  in  which  B 
enters  the  process. 

Step  i  of  LAL 

On  hand  are  P^_*  ,  (both  are  null  when  i  -  1)  ,  and  zi  which  is 
a  multiple  of  column  1  of  (z^  =  1)  ,  together  with  non  null  residual 
vectors  r^  ,  s*  and  scalars  u>  =  =  s*r,  ,  ||  r^  ||  ,  ||  s*  ||  . 


(a)  Complete  Rf  =  (r£,r£+])  and  S*  =  (s£ts£+])* 


Vi  ■  6rt  -  • 


V*  '  SJe  '  "*V* 


(2  matrix-vector  products  +  2  or  3  vector  ops.) 


(b)  Compute  needed  inner  products. 


9  =  S£rJi+l  =  SJM-*r«,  ,  ^  '  S£+*r£+l  ’ 


r£r£+l  *  S£S£+1  ’  'I  r£+l 


.  s, 


(6  v.  ops. ) 


(c)  Compute  cosines  of  important  angles. 


*1  *  cos  4(r£  ,  s4)  =  u  /((  r£((  •  (|  s£((  ,  4>2  =  0 


if  9=0  then  go  to  step  2,  otherwise 


Tl  =  co  /  9  ,  t2  =  w  /  8  . 

II  V,ll  ■  411  rl+,||  2  -  2r2(rjrl+1)  ♦  i|||  r,||  2) 

ll?tll  *  All  st||  2  -  2t,(s*sul)  ♦  t2  ||swl|  2) 

*1  a  cos4(r£,  s*+1)  =  6/ 1|  r£|j  .||  s*+1l|  , 

^2  =  cos4(r£+1,s*)  =  (wt2  -  9)  /  II  r£+1ll  'll  s£  ||  , 

$2  '  min  {|i|/-||  ,|K/2I>  • 

(0  v.ops. ) 
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Single 


ri  =  ziYX. 


Di  p2, 

(0  v.  ops. ) 


rJl+l  '  rS.+1 

SX,+1  2.+1 

(2  v.  ops.) 


Double 

complete 

ri  ,  Bi 

M  =  Z 

(0  st) 

Bi=  ( 

(0  v.  i 

form  new 

residuals 

/  6*  , 

Vl 

/  Yj  . 

SI+2  = 

,-1 


or 


c  :■) 


=  (0  1)  (P*  B  -  B.Pi.f)  , 

(2  matrix-vector  products 
+  2  or  3  v.  ops. ) 


(e)  form  A. 


=  1/Ti 


(0  v.  ops. ) 


A. 

n 


t2 


L^+i 


p t  BVi 


Pi  BVl 


(1  v.  op.) 


r 


(f)  orthogonal ize 


Double 


Vi  ^  Vl  ■  q£  . 


V*  ^  V*  "  * 


(2  v.  ops. ) 


Vl  ^  V2  "  QiAi  V  1  /  ’ 

VI  *  v!  -  <’ 01  VT  • 

(4  V.  ops.) 


(g)  inner  products  for  next  step 


+  a? II  r,||  2)1/2/S,  compute 


'•  Vl 


-  2v*  Vi +  v v  ;  v 

-  2ct:s*s;  +  1  +  sf  !|  s,Ji  2)1/2/Y? 


1+2 :1  ’ 


'*’*,+ 1  >+  / 


n  si+2" 


(0  v.  ops. ) 


(h)  reset  z 


V2  =  V2  V2 


(3  v.  ops. ) 


2t+i =  01 


Vi  y-uj  •  /(®2,  ^1+1  ^ 


end  of  step  i  of  LAL. 


Total  Cost  of  step  i  . 


Look-ahead:  2  matrix-vector  prods  +  9  v.  ops. 

Single  step:  6  v.  ops. 

Double  step:  2  matrix-vector  prods  +  16  v.  ops. 

For  comparison  we  note  that  the  standard  two  sided  Lanczos  algorithm  which 
keeps  ]|  p*  I]  =  ||  q„  ||  requires  2  matri x-vectors  prods  +  10  v.  ops. 

There  are  3  different  ways  of  advancing  two  steps 

LAN  :  4  matrix-vector  p's  +  20  v.  ops., 

LAL,  1  double  step  :  "  +25  v.  ops., 

LAL,  2  single  steps:  "  +  30  v.  ops. 

The  bias  factor  in  Step  3  is  a  programming  device  which  permits  LAL 
to  implement  standard  Lanczos  (bias  =  0)  or  a  sequence  of  double  steps 
(bias  =  «).  We  do  not  claim  that  our  setting  (bias  =  1)  is  optimal,  but  we 
doubt  that  it  is  far  off. 

7.  Numerical  Examples 

We  give  a  few  examples  which  contrast  the  performance  of  the  standard 
two  sided  Lanczos  algorithm  and  our  lookahead  variant.  They  illustruate 
the  stabilizing  effect  of  the  new  variant. 

As  mentioned  earlier  there  are  several  aspects  of  the  problem  which  we 
have  not  yet  addressed.  The  most  important  is  the  efficient  computation  of 
eigenvalues  of  J  .  All  our  computations  were  done  on  the  NOVA  840  of  the 
Remote  Sensing  Research  Program  at  the  University  of  California,  Berkeley. 
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This  forced  us  to  use  fairly  small  examples  and  this  permitted  us  to  use 
the  QR  programs  from  EISPACK  to  determine  J’s  eigenvalues. 

The  Look-ahead  algorithm  reduces  to  the  regular  algorithm  when  the  bias 
factor  =  0  .  We  kept  the  bias  =  1  for  all  our  examples. 

Explanation  of  the  Tables 


Each  table  gives  snapshots  of  a  Lanczos  run,  exhibiting  what  we  feel  is 
the  essential  information. 

I  -  the  order  of  the  J-matrix. 

,4>2  -  cosines  of  angles  between  various  candidates  for  p*  and 
.  See  section  4  for  precise  definition. 
p(J)  -  the  spectral  radius  of  J  . 

The  goal  of  our  algorithm  is  to  keep  ^  from  sudden  plunges  towards  0  . 

We  could  have  used  ||  J  j  j  instead  of  p(J)  ,  as  an  indication  of  "stability". 

We  fear  the  appearance  of  arbitrarily  large  "spurious"  eigenvalues  in  J  . 

We  expect  some  of  J's  eigenvalues  to  stabilize,  as  l  increases,  at  certain 
points  in  the  complex  plane.  These  points  should  be  part  of  B's  spectrum. 

If  a  double  step  occurs  in  the  Lookahead  algorithm  for  a  particular  value 
of  l  then  <j>i  and  ^  are  not  defined  at  £  +  1  and  such  places  are  indicated 
by  dashes. 

0  0  0  0  0  V 

1  0  0  0  0  0 

0  1  0  0  0  0 

0  0  1  0  0  0 

0  0  0  1  0  0 

0  0  0  0  1  0 

rl  s  S1  =  ^ ,2, 3, 4, 5, 6]* 


Example  I: 


B  « 


no.  of  steps  =  6 
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The  eigenvalues  of  B  are  the  sixth  roots  of  unity.  The  size  of  the  matrix 
allows  for  the  complete  history. 

TABLE  I 


Reg. 

Lanczos 

Look- 

■ahead  Lanczos 

i 

*1 

02 

p(J) 

*2 

n 

1.000 

.1277 

.8351 

1.000 

.1277 

.8351 

H 

.1281 

.0077 

1.503 

.1281 

.0077 

1.503 

.0072 

10**6 

1.012 

.0072 

10-* 

1.012 

M 

io-6 

io*6 

10"6 

.0488 

•  04oo 

10 

1U 

O 

- 

to-7 

4.781 

5 

6 

7 

1U 

.0068 

.0068 

1.000 

Comment: 

Steps  1  -  3  of  both  Regular  and  Look-ahead  Lanczos  are  identical.  At  step  4, 

-3 

Regular  Lanczos  normalizes  s^  and  r^  by  factors  of  10  ,  producing  elements 

in  J  of  size  106.  Step  5  of  Regular  Lanczos  is  then  aborted  because  the 
vectors  are  now  orthogonal  to  working  precision. 

The  Look-ahead  algorithm  avoids  the  large  element  growth  in  J,  by  doing  a 
double  step.  The  resulting  J-matrix  had  eigenvalues  identical  to  B  to 
working  accuracy. 

II.  B  is  12x12  block  upper  triangular  with  diagonal  blocks  shown  below. 

The  upper  elements  b^  were  randomly  distributed  in  [-5,  5]. 

'  J 
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OD 

U 

1  <£> 
ro  cr» 

2' 

95  J 

b2-  r 2 

L  L  95 

-95' 

9 

2. 

B3  =  99  ,  B4  =  B5  =  98  , 

B6  = 

'25 

.50 

-50' 

25  J  , 

CM 

1 

O 

II 

CO 

R  -  F50 
'  B®  L  50 

1 

cn  err 

O  O 
i  i 

r1  =  s1  =  [1,  ...  ,1]* 
No.  of  steps  taken  =  24. 


TABLE  II 


Reg. 

Lanczos 

Look- 

•ahead  Lanczos 

i 

<h 

*2 

0  ( J ) 

5I 

"2 

c(J) 

2 

.2075 

.6644 

269.3 

.2075 

,664^. 

1 

— “  1 

3 

.2101 

.20 59 

83.95 

— 

— 

83.95 

4 

.7592 

.4422 

IO8.5 

.7594 

.4421 

IO8.5 

5 

.4405 

.4718 

96.67 

.4403 

.4717 

6 

.3438 

.1787 

98.45 

— 

— 

98.45 

7 

.1304 

.0914 

98.40 

.1304 

.0914 

98.40 

8 

.1217 

.0641 

98.57 

.1217 

.0651 

98.57 

19 

.0012 

.0389 

O 

O 

.1386 

.0373 

107.2 

20 

.0012 

.0389 

1012 

.2115 

.2661 

— 

21 

.0012 

.0389 

10« 

— 

164.5 

22 

.0012 

.0389 

1011 

.1707 

.0109 

122.0 

23 

.0012 

.0389 

1014 

.0164 

.0706 

— 

24 

.0012 

.0389 

0 

00 

— 

— 

164.5 

lit.  v 


Comment  on  Example  II. 


B  is  fairly  far  from  normal.  After  24  steps,  7  of  J's  eigenvalues 
had  stabilized  to  between  3  and  7  decimal  places  using  the  Look-ahead 
algorithm.  What  is  surprising  is  that  the  regular  Lanczos  algorithm  pro¬ 
duced  J's  for  which  6  eigenvalues  had  stabilized  to  3  or  more  decimal 

18 

places  despite  elements  in  J  as  large  as  10  . 


III.  B  is  a  15  by  1 5  upper  triangular  matrix  with  evenly  spaced  real 

eigenvalues.  The  super  diagonal  elements  were  random  in  the  range  of 
[-10,  10].  The  diagonal  elements  were 


10j 

0 

10j 


75  ,  j  =  1,  ...  ,7, 

,  j  =  8, 

85  ,  j  =  9,  ...  ,15  . 


r -j  —  S-|  —  (1,1,  ...  ,1  )*. 


TABLE  III 


Reg. 

Lanczos 

*1 

*2 

P(J) 

.3205 

.O58O 

64.87 

.0646 

.2167 

154.2 

.0539 

.0535 

65.02 

.2843 

.2336 

65.OO 

.1651 

.2007 

65.00 

.1516 

.0447 

65.OO 

.0443 

.1934 

134.2 

.0391 

.0303 

160.0 

.0183 

.0262 

201.8 

.0438 

.0016 

249.1 

.0070 

.0896 

676.2 

.0072 

.0068 

281.8 

.0368 

.0176 

279.0 

.0269 

.0306 

290.6 

.0303 

.1304 

332.6 

Look-ahead  Lanczos 


.3205  .0580  64.87 

.0646  .216?  - 

- - 65.02 

.2839  .2353  65.00 

. 1659  • 1990  - 

— — 65.OO 

.1439  .1563  - 

-  65.OO 

.3650  .1736  65.00 

.2426  .4273  - 

-  63.OO 

.1691  .2296  - 

-  71.40 

.0704  .0284  65.OO 

.0518  .0114  111.7 


Comment:  The  run  was  halted  after  30  steps. 

LAL:  To  14  eigenvalues  of  B  there  were  approximations  good  to  between 
4  and  7  decimals,  including  duplicates  of  the  two  extremes. 


LAN:  To  4  eigenvalues  of  B  there  were  approximations  good  to  between 
4  and  7  decimals. 
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IV.  100  x  100  diagonal  matrix 

diag  (8)=  (1 ,2, ... ,20,41 ,62,. .. ,440,481 ,522,. .. ,1260,1321 ,1382,. . . 
2480, 2561, 2642,. ..,4100). 

r.j  ,s|  random  but  unequal 


TABLE  IV. 


Reg. 

Lanczos 

Look- 

-ahead  Lanczos 

i 

*1 

*2 

P(J) 

*1 

42 

P(J) 

4 

.0633 

.0037 

35*6. 

.0633 

.0037 

3546. 

5 

.0036 

.0163 

io4 

.0036 

.0163 

6 

.0035 

.0035 

4648. 

4669. 

7 

.0541 

.0493 

4489. 

.0521 

.0460 

4303. 

8 

.0536 

.0187 

4126. 

.0626 

.0444 

4219. 

9 

.0806 

.0625 

4866 

.0519 

.0027 

4129. 

10 

.0770 

.0582 

104 

.0029 

.0187 

— 

11 

.0823 

.0782 

io4 

4082. 

12 

.0815 

.0801 

105 

.0052 

.0158 

— 

13 

.0821 

.0819 

105 

4180. 

14 

.0819 

.0818 

105 

.0478 

.0367 

4180. 

15 

.0820 

.0820 

to6 

.0425 

.0204 

4099. 

16 

.0820 

.0820 

106 

.0225 

.0091 

4100. 

17 

.0820 

.0820 

107 

.0156 

.0042 

4102. 

18 

.0820 

.0820 

10  7 

.0089 

.0098 

— 

19 

.0820 

.0820 

10® 

4100. 

30  - 


Comment: 

After  19  steps  4  eigenvalues  of  B  were  represented  in  J  to  3  or  more 
decimal  places.  The  standard  Lanczos  lost  stability  and  no  eigenvalues  were 
represented  in  its  J  . 


8.  Conclusion 


The  Look-ahead  algorithm  is  more  complicated  than  the  standard  two-sided 
Lanczos  process.  In  return  it  seems  to  prevent  the  serious  breakdown  which 
afflicts  the  standard  program.  However,  much  more  testing  is  needed. 

Other  research  topics  which  are  relevant  to  a  satisfactory  solution  of 
the  eigenvalue  problem  for  large  nonnormal  matrices  are: 

(i)  the  use  of  reorthogonalization,  or  selective  orthogonalization. 
to  maintain  bi -orthogonal i ty. 

(ii)  the  use  of  the  Hyman-Laguerre  method  for  finding  eigenvalues  of  J. 

(iii)  the  use  of  the  LR  algorithm  with  interchanges  for  finding  eigen¬ 
values  of  J  . 

(iv)  incorporation  of  residual  error  bounds  as  termination  criteria. 

Saad  studies  alternative  techniques  which  produce  banded  Hessenberg 
J-matrices  in  [8]  and  he  gives  a  theoretical  treatment  of  the  standard 
two-sided  Lanczos  algorithm  in  [91. 
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